Network-Aware Byzantine Broadcast 
in Point-to-Point Networks using Local Linear Coding * 



Guanfeng Liang and Nitin Vaidya 
Department of Electrical and Computer Engineering, and 
Coordinated Science Laboratory 
University of Illinois at Urbana-Champaign 
gliang2@illinois.edu, nhv@illinois.edu 



Abstract 

The goal of Byzantine Broadcast (BB) is to allow a set of fault-free nodes to agree on 
information that a source node wants to broadcast to them, in the presence of Byzantine 
faulty nodes. We consider design of efficient algorithms for BB in synchronous 
point-to-point networks, where the rate of transmission over each communication link 
is limited by its "link capacity". The throughput of a particular BB algorithm is defined 
as the average number of bits that can be reliably broadcast to all fault-free nodes 
per unit time using the algorithm without violating the link capacity constraints. The 
capacity of BB in a given network is then defined as the supremum of all achievable BB 
throughputs in the given network, over all possible BB algorithms. 

We develop NAB - a Network-Aware Byzantine broadcast algorithm - for arbitrary 
point-to-point networks consisting of n nodes, wherein the number of faulty nodes is 
at most f, f < n/3, and the network connectivity is at least 2f + 1. We also prove 
an upper bound on the capacity of Byzantine broadcast, and conclude that NAB can 
achieve throughput at least 1/3 of the capacity. When the network satisfies an addi- 
tional condition, NAB can achieve throughput at least 1/2 of the capacity. 

To the best of our knowledge, NAB is the first algorithm that can achieve a constant 
fraction of capacity of Byzantine Broadcast (BB) in arbitrary point-to-point networks. 

* [...] National Science Foundation award 1059540. Any opinions, findings, and conclusions or 
recommendations expressed here are those of the authors and do not necessarily reflect the views of the 
funding agencies or the U.S. government. 



1 Introduction 



The problem of Byzantine Broadcast (BB) - also known as the Byzantine Generals problem 
[12] - was introduced by Pease, Shostak and Lamport in their 1980 paper [19]. Since the first 
paper on this topic, Byzantine Broadcast has been the subject of intense research activity, due to its 
many potential practical applications, including replicated fault-tolerant state machines [5], and 
fault-tolerant distributed file storage [20]. Informally, Byzantine Broadcast (BB) can be described 
as follows (we will define the problem more formally later). There is a source node that needs to 
broadcast a message (also called its input) to all the other nodes such that even if some of the nodes 
are Byzantine faulty, all the fault-free nodes will still be able to agree on an identical message; the 
agreed message is identical to the source's input if the source is fault-free. 

We consider the problem of maximizing the throughput of Byzantine Broadcast (BB) in syn- 
chronous networks of point-to-point links, wherein each directed communication link is subject to 
a "capacity" constraint. Informally speaking, throughput of BB is the number of bits of Byzantine 
Broadcast that can be achieved per unit time (on average), under the worst-case behavior by the 
faulty nodes. Despite the large body of work on BB [7, 6, 3, 11, 2, ?], performance of BB in arbitrary 
point-to-point networks has not been investigated previously. When capacities of the different 
links are not identical, previously proposed algorithms can perform poorly. In fact, one can easily 
construct example networks in which previously proposed algorithms achieve throughput that is 
arbitrarily worse than the optimal throughput. 

Our Prior Work: In our prior work, we have considered the problem of optimizing throughput of 
Byzantine Broadcast in 4-node networks [14]. By comparing with an upper bound on the capacity 
of BB in 4-node networks, we showed that our 4-node algorithm is optimal. Unfortunately, the 
4-node algorithm does not yield very useful insights on design of good algorithms for larger 
networks. This paper presents an algorithm that uses a different approach than that in [14], 
and also develops a different upper bound on capacity that is helpful in our analysis of the new 
algorithm. In other related work, we explored design of efficient Byzantine consensus algorithms 
when total communication cost is the metric (which is oblivious of link capacities) [15]. 

Main contributions: This paper studies throughput and capacity of Byzantine broadcast in arbi- 
trary point-to-point networks. 

1. We develop a Network-Aware Byzantine (NAB) broadcast algorithm for arbitrary point-to-point 
networks wherein each directed communication link is subject to a capacity constraint. 
The proposed NAB algorithm is "network-aware" in the sense that its design takes the link 
capacities into account. 

2. We derive an upper bound on the capacity of BB in arbitrary point-to-point networks. 

3. We show that NAB can achieve throughput at least 1/3 of the capacity in arbitrary point- 
to-point networks. When the network satisfies an additional condition, NAB can achieve 
throughput at least 1/2 of the capacity. 

We consider a synchronous system consisting of n nodes, named 1, 2, ..., n, with one node designated 
as the sender or source node. In particular, we will assume that node 1 is the source node. 
Source node 1 is given an input value x containing L bits, and the goal here is for the source to 
broadcast its input to all the other nodes. The following conditions must be satisfied when the 
input value at the source node is x: 

• Termination: Every fault-free node i must eventually decide on an output value of L bits; let 
us denote the output value of fault-free node i as $y_i$. 

• Agreement: All fault-free nodes must agree on an identical output value, i.e., there exists y 
such that $y_i = y$ for each fault-free node i. 

• Validity: If the source node is fault-free, then the agreed value must be identical to the input 
value of the source, i.e., y = x. 
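As an illustration only, the three conditions above can be expressed as a small predicate. This is a minimal Python sketch; the function name `check_bb_outcome` and the encoding of an undecided node as `None` are assumptions made here, not part of the problem definition.

```python
def check_bb_outcome(source_input, outputs, source_is_fault_free):
    """Check the three BB conditions for one instance.

    outputs maps each fault-free node id to its decided value y_i,
    or None if that node has not decided (hypothetical encoding).
    """
    values = list(outputs.values())
    # Termination: every fault-free node has decided on some value.
    termination = all(y is not None for y in values)
    # Agreement: all fault-free nodes decided on one identical value.
    agreement = termination and len(set(values)) == 1
    # Validity only constrains the outcome when the source is fault-free.
    validity = (not source_is_fault_free) or all(y == source_input for y in values)
    return termination, agreement, validity
```

Note that when the source is faulty, validity holds vacuously, so only termination and agreement constrain the outcome.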

Failure Model: The faulty nodes are controlled by an adversary that has complete knowledge 
of the network topology, the algorithm, and the information the source is trying to send. No secret 
is hidden from the adversary. The adversary can take over up to f nodes at any point during 
execution of the algorithm, where f < n/3. These nodes are said to be faulty. The faulty nodes can 
engage in any kind of deviation from the algorithm, including sending incorrect or inconsistent 
messages to their neighbors. 

We assume that the set of faulty nodes remains fixed across different instances of execution of 
the BB algorithm. This assumption captures the conditions in practical replicated server systems. 
In such a system, the replicas may use Byzantine Broadcast to agree on requests to be processed. 
The set of faulty (or compromised) replicas that may adversely affect the agreement on each request 
does not change arbitrarily. We model this by assuming that the set of faulty nodes remains fixed 
over time. 

When a faulty node fails to send a message to a neighbor as required by the algorithm, we 
assume that the recipient node interprets the missing message as being some default value. 

Network Model: We assume a synchronous point-to-point network modeled as a directed simple 
graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$, where the set of vertices $\mathcal{V} = \{1, 2, \ldots, n\}$ represents the nodes in the point-to-point 
network, and the set of edges $\mathcal{E}$ represents the links in the network. The capacity of an edge 
$e \in \mathcal{E}$ is denoted as $z_e$. With a slight abuse of terminology, we will use the terms edge and link 
interchangeably, and use the terms vertex and node interchangeably. We assume that $n \ge 3f + 1$ and 
that the network connectivity is at least 2f + 1 (these two conditions are necessary for the existence 
of a correct BB algorithm [7]). 

In the given network, links may not exist between all node pairs. Each directed link is asso- 
ciated with a fixed link capacity, which specifies the maximum amount of information that can be 
transmitted on that link per unit time. Specifically, over a directed edge e = (i, j) with capacity 
$z_e$ bits/unit time, we assume that up to $z_e t$ bits can be reliably sent from node i to node j over 
time duration t (for any non-negative t). This is a deterministic model of capacity that has been 
commonly used in other work [13, 4, 9, 10]. All link capacities are assumed to be positive integers. 
Rational link capacities can be turned into integers by choosing a suitable time unit. Irrational link 
capacities can be approximated by integers with arbitrary accuracy by choosing a suitably long 
time unit. Propagation delays on the links are assumed to be zero (relaxing this assumption does 
not impact the correctness of results shown for large input sizes). We also assume that each node 
correctly knows the identity of the nodes at the other end of its links. 
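The remark that rational capacities become integers under a suitable time unit can be made concrete as follows. This is a minimal sketch in Python (3.9 or later, for `math.lcm`); the function names and the example capacities are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import lcm


def integer_capacities(capacities):
    """Rescale rational link capacities to integers.

    capacities maps a directed edge (i, j) to a Fraction, in bits per
    original time unit. The new time unit equals `scale` original units.
    """
    scale = lcm(*(c.denominator for c in capacities.values()))
    scaled = {e: int(c * scale) for e, c in capacities.items()}
    return scale, scaled


def bits_sendable(z_e, t):
    """Deterministic capacity model: up to z_e * t bits in duration t."""
    return z_e * t
```

For example, capacities 3/2 and 2/3 become 9 and 4 when the time unit is stretched by a factor of 6.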

Throughput and Capacity of Byzantine Broadcast 

When defining the throughput of a given BB algorithm in a given network, we consider Q > 1 
independent instances of BB. The source node is given an L-bit input for each of these Q instances, 
and the validity and agreement properties need to be satisfied for each instance separately (i.e., 
independent of the outcome for the other instances). 
For any BB algorithm $\mathcal{A}$, denote $t(\mathcal{G}, L, Q, \mathcal{A})$ as the duration of time required, in the worst case, 
to complete Q instances of L-bit Byzantine Broadcast, without violating the capacity constraints of 
the links in $\mathcal{G}$. Throughput of algorithm $\mathcal{A}$ in network $\mathcal{G}$ for L-bit inputs is then defined as 

$$T(\mathcal{G}, L, \mathcal{A}) = \lim_{Q \to \infty} \frac{LQ}{t(\mathcal{G}, L, Q, \mathcal{A})}.$$

We then define capacity $C_{BB}$ as follows. 

Capacity $C_{BB}$ of Byzantine Broadcast in network $\mathcal{G}$ is defined as the supremum over the throughput 
of all algorithms $\mathcal{A}$ that solve the BB problem and all values of L. That is, 

$$C_{BB}(\mathcal{G}) = \sup_{\mathcal{A}, L} T(\mathcal{G}, L, \mathcal{A}). \qquad (1)$$



2 Algorithm Overview 

Each instance of our NAB algorithm performs Byzantine broadcast of an L-bit value. We 
assume that the NAB algorithm is used repeatedly, and during all these repeated executions, the 
cumulative number of faulty nodes is upper bounded by f. Due to this assumption, the algorithm 
can perform well by amortizing the cost of fault tolerance over a large number of executions. 
Larger values of L also result in better performance for the algorithm. The algorithm is intended 
to be used for sufficiently large L, to be elaborated later. 

The k-th instance of NAB executes on a network corresponding to graph $\mathcal{G}_k(\mathcal{V}_k, \mathcal{E}_k)$, defined as 
follows: 

• For the first instance, k = 1, and $\mathcal{G}_1 = \mathcal{G}$. Thus, $\mathcal{V}_1 = \mathcal{V}$ and $\mathcal{E}_1 = \mathcal{E}$. 

• The k-th instance of NAB occurs on graph $\mathcal{G}_k$ in the following sense: (i) all the fault-free nodes 
know the node and edge sets $\mathcal{V}_k$ and $\mathcal{E}_k$, (ii) only the nodes corresponding to the vertices 
in $\mathcal{V}_k$ need to participate in the k-th instance of BB, and (iii) only the links corresponding 
to the edges in $\mathcal{E}_k$ are used for communication in the k-th instance of NAB (communication 
received on other links can be ignored). 

During the k-th instance of NAB using graph $\mathcal{G}_k$, if misbehavior by some faulty node(s) 
is detected, then, as described later, additional information is gleaned about the potential 
identity of the faulty node(s). In this case, $\mathcal{G}_{k+1}$ is obtained by removing from $\mathcal{G}_k$ appropriately 
chosen edges and possibly some vertices (as described later). 

On the other hand, if during the k-th instance, no misbehavior is detected, then $\mathcal{G}_{k+1} = \mathcal{G}_k$. 

The k-th instance of NAB algorithm consists of three phases, as described next. The main contributions 
of this paper are (i) the algorithm used in Phase 2 below, and (ii) a performance analysis 
of NAB. 

If graph $\mathcal{G}_k$ does not contain the source node 1, then (as will be clearer later) by the start of the 
k-th instance of NAB, all the fault-free nodes already know that the source node is surely faulty; 
in this case, the fault-free nodes can agree on a default value for the output, and terminate the 
algorithm. Hereafter, we will assume that the source node 1 is in $\mathcal{G}_k$. 

[Figure 1: Example graphs. Panels (a) and (b).]

[Figure 2: Different graph representations of a network. Numbers next to the edges indicate link 
capacities. Panels: (a) directed graph G; (b) undirected graph corresponding to (a); (c) two 
unit-capacity spanning trees in the directed graph, where every directed edge has capacity 1; 
(d) a spanning tree in the undirected graph, shown in dotted edges.]


Phase 1: Unreliable Broadcast 



In Phase 1, source node 1 broadcasts L bits to all the other nodes in $\mathcal{G}_k$. This phase makes no 
effort to detect or tolerate misbehavior by faulty nodes. As elaborated in Appendix A, unreliable 
broadcast can be performed using a set of spanning trees embedded in graph $\mathcal{G}_k$. Now let us 
analyze the time required to perform unreliable broadcast in Phase 1. 

$\mathrm{MINCUT}(\mathcal{G}_k, 1, j)$ denotes the minimum cut in the directed graph $\mathcal{G}_k$ from source node 1 to 
node j. Let us define 

$$\gamma_k = \min_{j \in \mathcal{V}_k} \mathrm{MINCUT}(\mathcal{G}_k, 1, j).$$

$\mathrm{MINCUT}(\mathcal{G}_k, 1, j)$ is equal to the maximum flow rate possible from node 1 to node $j \in \mathcal{V}_k$. It is 
well-known [17] that $\gamma_k$ is the maximum rate achievable for unreliable broadcast from node 1 to 
all the other nodes in $\mathcal{V}_k$, under the capacity constraints on the links in $\mathcal{E}_k$. Thus, the least amount 
of time in which L bits can be broadcast by node 1 in graph $\mathcal{G}_k$ is given by¹ 

$$L / \gamma_k \qquad (2)$$

Clearly, $\gamma_k$ depends on the capacities of the links in $\mathcal{G}_k$. For example, if $\mathcal{G}_k$ were the directed graph 
in Figure 1(a), then $\mathrm{MINCUT}(\mathcal{G}_k, 1, 2) = \mathrm{MINCUT}(\mathcal{G}_k, 1, 4) = 2$, $\mathrm{MINCUT}(\mathcal{G}_k, 1, 3) = 3$, and hence 
$\gamma_k = 2$. 
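The quantity $\gamma_k$ can be computed with any maximum-flow routine, since each MINCUT equals the corresponding maximum flow. The following Python sketch uses a plain Edmonds-Karp implementation; the function names and the 4-node example graph are hypothetical (the graph is not the one in Figure 1(a)).

```python
from collections import deque


def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow. cap maps directed edge (u, v) -> capacity."""
    residual, adj = {}, {}
    for (u, v), c in cap.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck


def gamma(cap, nodes, source=1):
    """gamma_k: minimum over destinations j of MINCUT(G_k, source, j)."""
    return min(max_flow(cap, source, j) for j in nodes if j != source)


# Hypothetical example graph (an assumption for illustration only).
example_cap = {(1, 2): 2, (1, 3): 1, (2, 3): 1, (3, 4): 2, (1, 4): 1}
example_gamma = gamma(example_cap, [1, 2, 3, 4])
```

In this hypothetical graph the minimum cuts from node 1 to nodes 2, 3 and 4 are 2, 2 and 3, so $\gamma_k = 2$ and an L-bit unreliable broadcast needs at least L/2 time units.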

At the end of the broadcast operation in Phase 1, each node should have received L bits. At the 
end of Phase 1 of the k-th instance of NAB, one of the following four outcomes will occur: 

(i) The source node 1 is fault-free, and all the fault-free nodes correctly receive the source node's 
L-bit input for the k-th instance of NAB, or 

(ii) The source node 1 is fault-free, but some of the fault-free nodes receive incorrect L-bit values 
due to misbehavior by some faulty node(s), or 

(iii) The source node 1 is faulty, but all the fault-free nodes still receive an identical L-bit value in 
Phase 1, or 

(iv) The source node 1 is faulty, and the fault-free nodes do not all receive an identical L-bit value 
in Phase 1. 



The values received by the fault-free nodes in cases (i) and (iii) satisfy the agreement and validity 
conditions, whereas in cases (ii) and (iv) at least one of the two conditions is violated. 

Phase 2: Failure Detection 

Phase 2 performs the following two operations. As stipulated in the fault model, a faulty node 
may not follow the algorithm specification correctly. 

• (Step 2.1) Equality check: Using an Equality Check algorithm, the nodes in $\mathcal{V}_k$ perform a 
comparison of the L-bit value they received in Phase 1, to determine if all the nodes received 
an identical value. The source node 1 also participates in this comparison operation (treating 
its input as the value "received from" itself). 

Section 3 presents the Equality Check algorithm, which is designed to guarantee that if the 
values received by the fault-free nodes in Phase 1 are not identical, then at least one fault-free 
node will detect the mismatch. 

• (Step 2.2) Agreeing on the outcome of equality check: Using a previously proposed Byzantine 
broadcast algorithm, such as [19], each node performs Byzantine broadcast of a 1-bit flag to 
other nodes in $\mathcal{G}_k$, indicating whether it detected a mismatch during Equality Check. 

¹To simplify the analysis, we ignore propagation delays. Analogous results on throughput and 
capacity can be obtained in the presence of propagation delays as well. 

If any node broadcasts in step 2.2 that it has detected a mismatch, then subsequently Phase 3 
is performed. On the other hand, if no node announces a mismatch in step 2.2 above, then Phase 
3 is not performed; in this case, each fault-free node agrees on the value it received in Phase 1, and 
the k-th instance of NAB is completed. 

We will later prove that, when Phase 3 is not performed, the values agreed above by the fault-free 
nodes satisfy the validity and agreement conditions for the k-th instance of NAB. On the other 
hand, when Phase 3 is performed during the k-th instance of NAB, as noted below, Phase 3 results 
in correct outcome for the k-th instance. 

When Phase 3 is performed, Phase 3 determines $\mathcal{G}_{k+1}$. Otherwise, $\mathcal{G}_{k+1} = \mathcal{G}_k$. 
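The control flow of one instance, as described so far, can be summarized by the following Python skeleton. It is only a sketch: the four phase routines are passed in as hypothetical callables whose names and signatures are assumptions made here, and the skeleton says nothing about how each phase is implemented.

```python
def run_nab_instance(graph, unreliable_broadcast, equality_check,
                     agree_on_flags, dispute_control):
    """One instance of NAB on graph G_k, returning (outputs, G_{k+1}).

    All four callables are hypothetical placeholders for the phases:
      unreliable_broadcast(graph) -> {node: received value}      (Phase 1)
      equality_check(graph, received) -> {node: mismatch flag}   (Step 2.1)
      agree_on_flags(graph, flags) -> {node: agreed flag}        (Step 2.2)
      dispute_control(graph) -> (outputs, next_graph)            (Phase 3)
    """
    received = unreliable_broadcast(graph)
    flags = equality_check(graph, received)
    announced = agree_on_flags(graph, flags)
    if any(announced.values()):
        # Some node announced MISMATCH: Phase 3 produces the outputs
        # and determines the graph for the next instance.
        return dispute_control(graph)
    # No mismatch announced: keep the Phase 1 values and reuse the graph.
    return received, graph
```

The skeleton makes explicit that Phase 3, and hence any change of graph, is triggered only by an announced mismatch.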

Phase 3: Dispute Control 

Phase 3 employs a dispute control mechanism that has also been used in prior work [1, 15]. 
Appendix B provides the details of the dispute control algorithm used in Phase 3. Here we 
summarize the outcomes of this phase - this summary should suffice for understanding the main 
contributions of this paper. 

The dispute control in Phase 3 has very high overhead, due to the large amount of data that 
needs to be transmitted. From the above discussion of Phase 2, it follows that Phase 3 is performed 
only if at least one faulty node misbehaves during Phases 1 or 2. The outcomes from Phase 3 
performed during the k-th instance of NAB are as follows. 

• Phase 3 results in correct Byzantine broadcast for the k-th instance of NAB. This is obtained 
as a byproduct of the Dispute Control mechanism. 

• By the end of Phase 3, either one of the nodes in $\mathcal{V}_k$ is correctly identified as faulty, and/or 
at least one pair of nodes in $\mathcal{V}_k$, say nodes a, b, is identified as being "in dispute" with each 
other. When a node pair a, b is found in dispute, it is guaranteed that (i) at least one of these 
two nodes is faulty, and (ii) at least one of the directed edges (a, b) and (b, a) is in $\mathcal{E}_k$. Note 
that the dispute control phase never finds two fault-free nodes in dispute with each other. 

• Phase 3 in the k-th instance computes graph $\mathcal{G}_{k+1}$. In particular, any nodes that can be inferred 
as being faulty based on their behavior so far are excluded from $\mathcal{V}_{k+1}$; links attached to such 
nodes are excluded from $\mathcal{E}_{k+1}$. In Appendix B we elaborate on how the faulty nodes are 
identified. Then, for each node pair in $\mathcal{V}_{k+1}$, if that node pair has been found in dispute in 
at least one instance of NAB so far, the links between the node pair are excluded from $\mathcal{E}_{k+1}$. 
Phase 3 ensures that all the fault-free nodes compute an identical graph $\mathcal{G}_{k+1} = (\mathcal{V}_{k+1}, \mathcal{E}_{k+1})$ 
to be used during the next instance of NAB. 
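The graph update in the last item above can be sketched in a few lines of Python. The data representation (a node set, a dictionary of directed edge capacities, and disputes stored as unordered pairs) is an assumption made here for illustration; how faulty nodes and disputes are actually identified is left to Appendix B.

```python
def next_graph(nodes, edges, faulty, disputes):
    """Compute (V_{k+1}, E_{k+1}) from (V_k, E_k).

    nodes:    set of node ids in V_k
    edges:    dict mapping directed edge (u, v) -> capacity, i.e. E_k
    faulty:   set of nodes identified as faulty so far
    disputes: set of frozenset({a, b}) pairs found in dispute so far
    """
    # Nodes inferred to be faulty are excluded from the vertex set.
    new_nodes = nodes - faulty
    # Drop links attached to excluded nodes and links between disputed pairs.
    new_edges = {
        (u, v): c
        for (u, v), c in edges.items()
        if u in new_nodes and v in new_nodes
        and frozenset((u, v)) not in disputes
    }
    return new_nodes, new_edges
```

Because the update is a deterministic function of agreed-upon information, all fault-free nodes that hold the same inputs compute the same graph.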

Consider two special cases for the k-th instance of NAB: 

• If graph $\mathcal{G}_k$ does not contain the source node 1, it implies that all the fault-free nodes are 
aware that node 1 is faulty. In this case, they can safely agree on a default value as the 
outcome for the k-th instance of NAB. 

• Similarly, if the source node is in $\mathcal{G}_k$ but at least f other nodes are excluded from $\mathcal{G}_k$, that 
implies that the remaining nodes in $\mathcal{G}_k$ are all fault-free; in this case, algorithm NAB can be 
reduced to just Phase 1. 

Observe that during each execution of Phase 3, either a new pair of nodes in dispute is identified, 
or a new node is identified as faulty. Once a node is found to be in dispute with f + 1 distinct 
nodes, it can be identified as faulty, and excluded from the algorithm's execution. Therefore, 
Dispute Control needs to be performed at most f(f + 1) times over repeated executions of NAB. 
Thus, even though each dispute control phase is expensive, the bounded number ensures that the 
amortized cost over a large number of instances of NAB is small, as reflected in the performance 
analysis of NAB (in Section 5 and Appendix D). 
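The f + 1 rule and the amortization argument can be illustrated numerically. In the Python sketch below, the function names and the cost parameters (`per_instance_time`, `dispute_time`) are hypothetical; only the bound f(f + 1) and the f + 1 dispute rule come from the text.

```python
from collections import Counter


def nodes_exposed_by_disputes(disputes, f):
    """Nodes in dispute with at least f + 1 distinct nodes must be faulty.

    disputes is a set of frozenset({a, b}) pairs (hypothetical encoding).
    """
    counts = Counter(node for pair in disputes for node in pair)
    return {node for node, c in counts.items() if c >= f + 1}


def max_dispute_phases(f):
    """Upper bound on the number of Dispute Control executions."""
    return f * (f + 1)


def amortized_time(Q, per_instance_time, dispute_time, f):
    """Upper bound on average time per instance over Q instances,
    assuming each Dispute Control phase costs dispute_time."""
    return per_instance_time + max_dispute_phases(f) * dispute_time / Q
```

With f = 2, at most 6 dispute control phases occur; if each costs 500 time units and a normal instance costs 10, the average over Q = 1000 instances is at most 13 time units, and the overhead term vanishes as Q grows.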

3 Equality Check Algorithm with Parameter $\rho_k$ 

We now present the Equality Check algorithm used in Phase 2, which has an integer parameter 
$\rho_k$ for the k-th instance of NAB. Later in this section, we will elaborate on the choice of $\rho_k$, which 
is dependent on capacities of the links in $\mathcal{G}_k$. 

Let us denote by $x_i$ the L-bit value received by fault-free node $i \in \mathcal{V}_k$ in Phase 1 of the k-th 
instance. For simplicity, we do not include index k in the notation $x_i$. To simplify the presentation, 
let us assume that $L/\rho_k$ is an integer. Thus we can represent the L-bit value $x_i$ as $\rho_k$ symbols from 
Galois Field $GF(2^{L/\rho_k})$. In particular, we represent $x_i$ as a vector $X_i$, 

$$X_i = [X_i(1), X_i(2), \ldots, X_i(\rho_k)]$$

where each symbol $X_i(j) \in GF(2^{L/\rho_k})$ can be represented using $L/\rho_k$ bits. As discussed earlier, for 
convenience, we assume that all the link capacities are integers when using a suitable time unit. 
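The representation of an L-bit value as $\rho_k$ symbols of $L/\rho_k$ bits each amounts to slicing a bit string. The Python sketch below treats the value as a non-negative integer and each symbol as the bit pattern of a field element; the function names and the most-significant-first ordering are assumptions made here, and no field arithmetic is implemented.

```python
def to_symbols(x, L, rho):
    """Split an L-bit integer x into rho symbols of L // rho bits each,
    most significant symbol first (ordering is an assumption)."""
    if L % rho != 0:
        raise ValueError("L / rho must be an integer")
    width = L // rho
    mask = (1 << width) - 1
    return [(x >> (width * (rho - 1 - j))) & mask for j in range(rho)]


def from_symbols(symbols, L, rho):
    """Inverse of to_symbols: reassemble the L-bit integer."""
    width = L // rho
    x = 0
    for s in symbols:
        x = (x << width) | s
    return x
```

For instance, the 6-bit value 101100 with $\rho_k = 3$ yields the three 2-bit symbols 10, 11 and 00.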

Algorithm 1 Equality Check in $\mathcal{G}_k$ with parameter $\rho_k$ 
Each node $i \in \mathcal{V}_k$ should perform these steps: 

1. On each outgoing link $e = (i, j) \in \mathcal{E}_k$ whose capacity is $z_e$, node i transmits $z_e$ linear 
combinations of the $\rho_k$ symbols in vector $X_i$, with the weights for the linear combinations being 
chosen from $GF(2^{L/\rho_k})$. 

More formally, for each outgoing edge $e = (i, j) \in \mathcal{E}_k$ of capacity $z_e$, a $\rho_k \times z_e$ matrix $C_e$ is 
specified as a part of the algorithm. Entries in $C_e$ are chosen from $GF(2^{L/\rho_k})$. Node i sends to 
node j a vector $Y_e$ of $z_e$ symbols obtained as the matrix product $Y_e = X_i C_e$. Each element of 
$Y_e$ is said to be a "coded symbol". The choice of the matrix $C_e$ affects the correctness of the 
algorithm, as elaborated later. 

2. On each incoming edge $d = (j, i) \in \mathcal{E}_k$, node i receives a vector $Y_d$ containing $z_d$ symbols from 
$GF(2^{L/\rho_k})$. Node i then checks, for each incoming edge d, whether $Y_d = X_i C_d$. The check is 
said to fail iff $Y_d \ne X_i C_d$. 

3. If checks of symbols received on any incoming edge fail in the previous step, then node i sets 
a 1-bit flag equal to MISMATCH; else the flag is set to NULL. This flag is broadcast in Step 2.2 
above. 
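The steps of Algorithm 1 can be simulated centrally as in the following Python sketch. Two simplifications are assumed here and are not part of the algorithm: arithmetic is done over the prime field GF(P) with P = 2^31 - 1 as a stand-in for $GF(2^{L/\rho_k})$, and the optional `sent` argument is a hypothetical hook for modeling a faulty sender that transmits arbitrary symbols.

```python
import random

P = 2_147_483_647  # prime 2^31 - 1; stand-in for GF(2^{L/rho_k}) (assumption)


def random_coding_matrices(edges, rho, rng):
    """One rho x z_e matrix C_e per edge, entries uniform over GF(P).

    edges maps a directed edge (i, j) -> integer capacity z_e.
    """
    return {e: [[rng.randrange(P) for _ in range(z)] for _ in range(rho)]
            for e, z in edges.items()}


def encode(X, C):
    """Row vector X (length rho) times matrix C (rho x z) over GF(P)."""
    z = len(C[0])
    return [sum(X[r] * C[r][c] for r in range(len(X))) % P for c in range(z)]


def equality_check(edges, values, C, sent=None):
    """Return {node: flag}, where True stands for MISMATCH.

    values maps node -> vector X_i. sent optionally maps an edge to the
    vector actually transmitted on it (models a faulty sender).
    """
    flags = {i: False for i in values}
    for (j, i) in edges:
        # Step 1: sender j transmits Y = X_j C_e (unless overridden).
        if sent and (j, i) in sent:
            Y = sent[(j, i)]
        else:
            Y = encode(values[j], C[(j, i)])
        # Step 2: receiver i compares against its own X_i C_e.
        if Y != encode(values[i], C[(j, i)]):
            flags[i] = True  # Step 3: flag set to MISMATCH
    return flags
```

Each receiver only compares what arrives on a link with its own coded symbols for that link, so no node relays data on behalf of another node.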
In the Equality Check algorithm, $z_e$ symbols of size $L/\rho_k$ bits are transmitted on each link e of 
capacity $z_e$. Therefore, the Equality Check algorithm requires time duration 

$$L / \rho_k \qquad (3)$$

Salient Feature of Equality Check Algorithm 

In the Equality Check algorithm, a single round of communication occurs between adjacent 
nodes. No node is required to forward packets received from other nodes during the Equality 
Check algorithm. This implies that, while a faulty node may send incorrect packets to its neighbors, 
it cannot tamper with information sent between fault-free nodes. This feature of Equality Check is 
important in being able to prove its correctness despite the presence of faulty nodes in $\mathcal{G}_k$. 

Choice of Parameter $\rho_k$ 

We define a set $\Omega_k$ as follows using the disputes identified through the first (k - 1) instances of 
NAB. 

$\Omega_k$ = { H | H is a subgraph of $\mathcal{G}_k$ containing (n - f) nodes such that no two nodes in H 
have been found in dispute through the first (k - 1) instances of NAB } 

As noted in the discussion of Phase 3 (Dispute Control), fault-free nodes are never found in dispute 
with each other (fault-free nodes may be found in dispute with faulty nodes, however). This implies 
that $\mathcal{G}_k$ includes all the fault-free nodes, since a fault-free node will never be found in dispute with 
f + 1 other nodes. There are at least n - f fault-free nodes in the network. This implies that set $\Omega_k$ 
is non-empty. 

Corresponding to a directed graph H(V, E), let us define an undirected graph $\overline{H}(V, \overline{E})$ as follows: 
(i) both H and $\overline{H}$ contain the same set of vertices, (ii) undirected edge $(i, j) \in \overline{E}$ if either $(i, j) \in E$ 
or $(j, i) \in E$, and (iii) capacity of undirected edge $(i, j) \in \overline{E}$ is defined to be equal to the sum of the 
capacities of directed links (i, j) and (j, i) in E (if a directed link does not exist in E, here we treat its 
capacity as 0). For example, Figure 2(b) shows the undirected graph corresponding to the directed 
graph in Figure 2(a). 

Define a set of undirected graphs $\overline{\Omega}_k$ as follows. $\overline{\Omega}_k$ contains the undirected version of each directed 
graph in $\Omega_k$. 

$$\overline{\Omega}_k = \{ \overline{H} \mid H \in \Omega_k \}$$

Define $U_k = \min_{\overline{H} \in \overline{\Omega}_k} \min_{i, j \in \overline{H}} \mathrm{MINCUT}(\overline{H}, i, j)$ as the minimum value of the MINCUTs between all 
pairs of nodes in all the undirected graphs in the set $\overline{\Omega}_k$. For instance, suppose that n = 4, f = 1 and 
the graph shown in Figure 1(a) is $\mathcal{G}$, whereas $\mathcal{G}_k$ is the graph shown in Figure 1(b). Thus, nodes 2 
and 3 have been found in dispute previously. Then, $\Omega_k$ and $\overline{\Omega}_k$ each contain two subgraphs, one 
subgraph corresponding to the node set {1, 2, 4}, and the other subgraph corresponding to the node 
set {1, 3, 4}. In this example, $U_k = 2$. Also notice that in this example, there is no edge between 
nodes 2 and 4 in $\mathcal{G}$ to begin with - so these two nodes will never be found in dispute. 
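The definitions of $\Omega_k$, the undirected graphs and $U_k$ translate directly into a brute-force computation, sketched below in Python. The function names are hypothetical, subgraphs are taken to be induced by their node sets (an assumption), and the example is a hypothetical complete 4-node graph with unit capacities, not the graph of Figure 1.

```python
from collections import deque
from itertools import combinations


def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow. cap maps directed edge (u, v) -> capacity."""
    residual, adj = {}, {}
    for (u, v), c in cap.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck


def omega(nodes, disputes, n, f):
    """Node sets of size n - f in which no two nodes are in dispute."""
    return [set(S) for S in combinations(sorted(nodes), n - f)
            if not any(frozenset(p) in disputes for p in combinations(S, 2))]


def undirected(edges, S):
    """Undirected version of the subgraph induced by S, returned as a
    symmetric directed capacity map (capacity = sum of both directions)."""
    und = {}
    for (u, v), c in edges.items():
        if u in S and v in S:
            key = frozenset((u, v))
            und[key] = und.get(key, 0) + c
    sym = {}
    for key, c in und.items():
        u, v = tuple(key)
        sym[(u, v)] = c
        sym[(v, u)] = c
    return sym


def U(edges, nodes, disputes, n, f):
    """Minimum pairwise MINCUT over all undirected graphs in the set."""
    return min(max_flow(undirected(edges, S), i, j)
               for S in omega(nodes, disputes, n, f)
               for i, j in combinations(sorted(S), 2))


# Hypothetical example: complete graph on 4 nodes, unit capacities,
# nodes 2 and 3 in dispute (so links between them are already removed).
example_disputes = {frozenset((2, 3))}
example_edges = {(u, v): 1 for u in range(1, 5) for v in range(1, 5)
                 if u != v and frozenset((u, v)) not in example_disputes}
```

In this hypothetical example the two admissible node sets are {1, 2, 4} and {1, 3, 4}; each induces a triangle whose undirected edges have capacity 2, so every pairwise minimum cut is 4 and the computed value of $U_k$ is 4.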

Parameter $\rho_k$ is chosen such that 

$$\rho_k \le \frac{U_k}{2}.$$

Under the above constraint on $\rho_k$, as per (3), execution time of Equality Check is minimized when 
$\rho_k = U_k/2$. Under the above constraint on $\rho_k$, we will prove the correctness of the Equality Check 
algorithm, with its execution time being $L/\rho_k$. 
3.1 Correctness of the Equality Check Algorithm 

The correctness of Algorithm 1 depends on the choices of the parameter $\rho_k$ and the set of coding 
matrices $\{C_e \mid e \in \mathcal{E}_k\}$. Let us say that a set of coding matrices is correct if the resulting Equality Check 
Algorithm 1 satisfies the following requirement: 

• (EC) if there exists a pair of fault-free nodes $i, j \in \mathcal{V}_k$ such that $X_i \ne X_j$ (i.e., $x_i \ne x_j$), 
then the 1-bit flag at at least one fault-free node is set to MISMATCH. 

Recall that $X_i$ is a vector representation of the L-bit value $x_i$ received by node i in Phase 1 of NAB. 
Two consequences of the above correctness condition are: 

• If some node (possibly the source node) misbehaves during Phase 1 leading to outcomes (ii) 
or (iv) for Phase 1, then at least one fault-free node will set its flag to MISMATCH. In this 
case, the fault-free nodes (possibly including the sender) do not share identical L-bit values 
$x_i$ as the outcome of Phase 1. 

• If no misbehavior occurs in Phase 1 (thus the values received by fault-free nodes in Phase 
1 are correct), but MISMATCH flag at some fault-free node is set in Equality Check, then 
misbehavior must have occurred in Phase 2. 

The following theorem shows that when $\rho_k \le U_k/2$, and when L is sufficiently large, there 
exists a set of coding matrices $\{C_e \mid e \in \mathcal{E}_k\}$ that are correct. 

Theorem 1 For $\rho_k \le U_k/2$, when the entries of the coding matrices $\{C_e \mid e \in \mathcal{E}_k\}$ in step 1 of Algorithm 1 are 
chosen independently and uniformly at random from $GF(2^{L/\rho_k})$, then $\{C_e \mid e \in \mathcal{E}_k\}$ is correct with probability 
$\ge 1 - 2^{-L/\rho_k} \left[ \binom{n}{n-f} (n - f - 1) \rho_k \right]$. Note that when L is large enough, $1 - 2^{-L/\rho_k} \left[ \binom{n}{n-f} (n - f - 1) \rho_k \right] > 0$. 

Proof Sketch: The detailed proof is presented in Appendix C. Here we provide a sketch of the 
proof. The goal is to prove that property (EC) above holds with a non-zero probability. That is, 
regardless of which (up to f) nodes in $\mathcal{G}$ are faulty, when $X_i \ne X_j$ for some pair of fault-free nodes 
i and j in $\mathcal{G}_k$ during the k-th instance, at least one fault-free node (which may be different from 
nodes i and j) will set its 1-bit flag to MISMATCH. To prove this, we consider every subgraph 
$H \in \Omega_k$ (see definition of $\Omega_k$ above). By definition of $\Omega_k$, no two nodes in H have been found in 
dispute through the first (k - 1) instances of NAB. Therefore, H represents one potential set of n - f 
fault-free nodes in $\mathcal{G}_k$. For each edge e = (i, j) in H, steps 1-2 of Algorithm 1 together have the 
effect of checking whether or not $(X_i - X_j) C_e = 0$. Without loss of generality, for the purpose of this 
proof, rename the nodes in H as 1, ..., n - f. Denote $D_i = X_i - X_{n-f}$ for i = 1, ..., (n - f - 1), then 

$$(X_i - X_j) C_e = 0 \iff \begin{cases} (D_i - D_j) C_e = 0, & \text{if } i, j < n - f; \\ D_i C_e = 0, & \text{if } j = n - f; \\ -D_j C_e = 0, & \text{if } i = n - f. \end{cases} \qquad (4)$$

Define $D_H = [D_1, D_2, \ldots, D_{n-f-1}]$. Let m be the sum of the capacities of all the directed edges 
in H. As elaborated in Appendix C, we define $C_H$ to be a $(n - f - 1)\rho_k \times m$ matrix whose entries are 
obtained using the elements of $C_e$ for each edge e in H in an appropriate manner. For the suitably 
defined $C_H$ matrix, we can show that the comparisons in steps 1-2 of Algorithm 1 at all the nodes 
in $H \in \Omega_k$ are equivalent to checking whether or not 

$$D_H C_H = 0. \qquad (5)$$
We can show that for a particular subgraph $H \in \Omega_k$, when $\rho_k \le U_k/2$, $m \ge (n - f - 1)\rho_k$; and when 
the set of coding matrices $\{C_e \mid e \in \mathcal{E}_k\}$ are generated as described in Theorem 1, for large enough 
L, with non-zero probability $C_H$ contains a $(n - f - 1)\rho_k \times (n - f - 1)\rho_k$ square submatrix that is 
invertible. In this case $D_H C_H = 0$ if and only if $D_H = 0$, i.e., $X_1 = X_2 = \cdots = X_{n-f}$. In other words, 
if all nodes in subgraph H are fault-free, and $X_i \ne X_j$ for two fault-free nodes i, j, then $D_H C_H \ne 0$ 
and hence the check in step 2 of Algorithm 1 fails at some fault-free node in H. 

We can then show that, for large enough L, with a non-zero probability, this is also simultaneously 
true for all subgraphs $H \in \Omega_k$. This implies that, for large enough L, correct coding matrices 
($C_e$ for each $e \in \mathcal{E}_k$) can be found. These coding matrices are specified as a part of the algorithm 
specification. Further details of the proof are in Appendix C. □ 
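To get a feel for how large L must be, the probability bound stated in Theorem 1 can be evaluated numerically. The Python sketch below simply evaluates that expression; the function names are hypothetical, and the search over multiples of $\rho_k$ reflects the earlier simplifying assumption that $L/\rho_k$ is an integer.

```python
from math import comb


def correctness_probability_lower_bound(n, f, rho, L):
    """Evaluate 1 - 2^(-L/rho) * C(n, n-f) * (n-f-1) * rho from Theorem 1."""
    return 1 - 2.0 ** (-L / rho) * comb(n, n - f) * (n - f - 1) * rho


def min_L_for_positive_bound(n, f, rho):
    """Smallest multiple of rho for which the bound is strictly positive."""
    L = rho
    while correctness_probability_lower_bound(n, f, rho, L) <= 0:
        L += rho
    return L
```

For n = 4, f = 1 and $\rho_k = 2$, the bracketed factor is 4 · 2 · 2 = 16, so the bound becomes positive once $2^{-L/2} < 1/16$, that is, from L = 10 onward, and it approaches 1 exponentially fast in L.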

4 Correctness of NAB 

For Phase 1 (Unreliable Broadcast) and Phase 3 (Dispute Control), the proof that the outcomes 
claimed in Section 2 indeed occur follows directly from the prior literature cited in Section 2 (and 
elaborated in Appendices A and B). Now consider two cases: 

• The values received by the fault-free nodes in Phase 1 are not identical: Then the correctness 
of Equality Check ensures that a fault-free node will detect the mismatch, and consequently 
Phase 3 will be performed. As a byproduct of Dispute Control in Phase 3, the fault-free nodes 
will correctly agree on a value that satisfies the validity and agreement conditions. 

• The values received by the fault-free nodes in Phase 1 are identical: If no node announces 
a mismatch in step 2.2, then the fault-free nodes will agree on the value received in Phase 
1. It is easy to see that this is a correct outcome. On the other hand, if some (faulty) node 
announces a mismatch in step 2.2, then Dispute Control will be performed, which will result 
in correct outcome for the broadcast of the k-th instance. 

Thus, in all cases, NAB will lead to correct outcome in each instance. 

5 Throughput of NAB and Capacity of BB 

5.1 A Lower Bound on Throughput of NAB for Large Q and L 

In this section, we provide the intuition behind the derivation of the lower bound. More detail 
is presented in Appendix D. We prove the lower bound when the number of instances Q and input 
size L for each instance are both "large" (in an order sense) compared to n. Two consequences of 
L and Q being large: 

• As a consequence of Q being large, the average overhead of Dispute Control per instance of 
NAB becomes negligible. Recall that Dispute Control needs to be performed at most f(f + 1) 
times over Q executions of NAB. 

• As a consequence of L being large, the overhead of 1-bit broadcasts performed in step 2.2 of 
Phase 2 becomes negligible when amortized over the L bits being broadcast by the source in 
each instance of NAB. 

It then suffices to consider only the time it takes to complete the Unreliable Broadcast in Phase 1 and 
Equality Check in Phase 2. For the k-th instance of NAB, as discussed previously, the unreliable 
broadcast in Phase 1 can be done in $L/\gamma_k$ time units (see definition of $\gamma_k$ in Section 2). We now 
define 

$\Gamma$ = { H | H is a subgraph of $\mathcal{G}$ containing source node 1, 
and $\mathcal{G}_k$ may equal H in some execution of NAB for some k } 

Appendix E provides a systematic construction of the set $\Gamma$. Define the minimum value of all 
possible $\gamma_k$: 

$$\gamma^* = \min_{\mathcal{G}_k \in \Gamma} \gamma_k = \min_{\mathcal{G}_k \in \Gamma} \min_{j \in \mathcal{V}_k} \mathrm{MINCUT}(\mathcal{G}_k, 1, j).$$

Then an upper bound of the execution time of Phase 1 in all instances of NAB is $L/\gamma^*$. 

With parameter p]^ - LZjc/2, the execution time of the Equality Check in Phase 2 is L/p/^. Recall 
that LJjc is defined as the minimum value of the MINCUTs between all pairs of nodes in all 
undirected graphs in the set Qjt. As discussed in Appendix IC.21 Qj; c Q^, where Qi - Q. Hence 
Ji)t > Jii in all possible Q-^. Define 

p* = ^ = min min _MINCUT{H,i, j). 

HeQj nodes i,j in H 

Then pj^ > p* for all possible arid the execution time of the Equality Check is upper-bounded by 
L/p*. So the throughput of NAB for large Q and L can be lower bounded b}|^ 

lim T{g, L, NAB) > . . . . ^ = (6) 
L^oo L/y* + L/p* y* + p* 
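
As an illustration of the quantity $\gamma_k$ used above, the following minimal sketch computes $\min_j MINCUT(\mathcal{G}_k, 1, j)$ for one small directed graph, using the max-flow/min-cut equivalence. The Edmonds-Karp routine, the function names, and the example capacities are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow; `capacity` maps a directed edge (u, v) to its capacity."""
    residual, neighbors = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)          # reverse edge for the residual graph
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                          # no augmenting path left
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck

def gamma(capacity, source=1):
    """min over all other nodes j of MINCUT(G, source, j)."""
    nodes = {x for edge in capacity for x in edge}
    return min(max_flow(capacity, source, j) for j in nodes if j != source)

# Hypothetical 4-node graph (not the graph of Figure 2).
example = {(1, 2): 2, (1, 3): 1, (2, 3): 1, (2, 4): 1, (3, 4): 1}
L = 12
phase1_time = L / gamma(example)   # L / gamma_k time units for Phase 1
```

Computing $\gamma^*$ itself would repeat this over every graph in $\Gamma$, which this sketch does not attempt.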

5.2 An Upper Bound on Capacity of BB 

Theorem 2 In any point-to-point network $\mathcal{G}(\mathcal{V}, \mathcal{E})$, the capacity of Byzantine broadcast ($C_{BB}$) with node
1 as the source satisfies the following upper bound:

\[ C_{BB}(\mathcal{G}) \le \min(\gamma^*, 2\rho^*). \]

Appendix F presents a proof of this upper bound. Given the throughput lower bound $T_{NAB}(\mathcal{G})$
in (6) and the upper bound on $C_{BB}(\mathcal{G})$ from Theorem 2, the result below can be obtained, as
shown in Appendix G.



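Theorem 3 For graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$:

\[ \lim_{L \to \infty} T(\mathcal{G}, L, NAB) \ge \min(\gamma^*, 2\rho^*)/3 \ge C_{BB}(\mathcal{G})/3. \]

Moreover, when $\gamma^* \le \rho^*$:

\[ \lim_{L \to \infty} T(\mathcal{G}, L, NAB) \ge \min(\gamma^*, 2\rho^*)/2 \ge C_{BB}(\mathcal{G})/2. \]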

Footnote (to the lower bound (6)): To simplify the analysis above, we ignored propagation delays.
Appendix D describes how to achieve this bound even when propagation delays are considered.

6 Conclusion



This paper presents NAB, a network-aware Byzantine broadcast algorithm for point-to-point
networks. We derive an upper bound on the capacity of Byzantine broadcast, and show that
NAB can achieve throughput of at least 1/3 of the capacity over a large number of execution
instances, when L is large. The fraction can be improved to at least 1/2 when the network satisfies
an additional condition.

Appendices



A Unreliable Broadcast in Phase 1 

According to [16], in a given graph $\mathcal{G}_k$ with $\gamma_k = \min_{j \in \mathcal{V}_k} MINCUT(\mathcal{G}_k, 1, j)$, there always exists a
set of $\gamma_k$ unit-capacity spanning trees of $\mathcal{G}_k$ such that the total usage on each edge $e \in \mathcal{E}_k$ by all the
spanning trees combined is no more than its link capacity $z_e$. Each spanning tree is "unit-capacity"
in the sense that 1 unit capacity of each link on that tree is allocated for transmissions on that
tree. For example, Figure 2(c) shows 2 unit-capacity spanning trees that can be embedded in the
directed graph in Figure 2(a): one spanning tree is shown with solid edges and the other spanning
tree is shown in dotted edges. Observe that link (1,2) is used by both spanning trees, each tree
using a unit capacity on link (1,2), for a total usage of 2 units, which is the capacity of link (1,2).

To broadcast an L-bit value from source node 1, we represent the L-bit value as $\gamma_k$ symbols,
each symbol being represented using $L/\gamma_k$ bits. One symbol ($L/\gamma_k$ bits) is then transmitted along
each of the $\gamma_k$ unit-capacity spanning trees.
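
The symbol split described above can be sketched as follows. This is only an illustration under the assumption that L is a multiple of $\gamma_k$; the bit-string representation and the function name are hypothetical, and the forwarding along the trees is not modelled.

```python
def split_into_symbols(bits: str, gamma_k: int) -> list[str]:
    """Split an L-bit value into gamma_k symbols of L/gamma_k bits each.

    Symbol number t would be sent along the t-th unit-capacity spanning tree.
    """
    if gamma_k <= 0 or len(bits) % gamma_k != 0:
        raise ValueError("this sketch assumes L is a positive multiple of gamma_k")
    size = len(bits) // gamma_k
    return [bits[t * size:(t + 1) * size] for t in range(gamma_k)]

value = "10110010"                      # L = 8
symbols = split_into_symbols(value, 2)  # gamma_k = 2 spanning trees
reassembled = "".join(symbols)          # a receiver concatenates the symbols
```

Since each tree carries one symbol of $L/\gamma_k$ bits at unit rate, all symbols are delivered in $L/\gamma_k$ time units when propagation delay is ignored.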



B Dispute Control 

The dispute control algorithm is performed in the k-th instance of NAB only if at least one node
misbehaves during Phases 1 or 2. The goal of dispute control is to learn some information about
the identity of at least one faulty node. In particular, the dispute control algorithm will identify a
new node as being faulty, and/or identify a new node pair in dispute (at least one of the nodes in
the pair is guaranteed to be faulty). The steps in dispute control in the k-th instance of NAB are as
follows:



• (DC1) Each node i in $\mathcal{V}_k$ uses a previously proposed Byzantine broadcast algorithm, such
as [6], to broadcast to all other nodes in $\mathcal{V}_k$ all the messages that this node i claims to have
received from other nodes, and sent to the other nodes, during Phases 1 and 2 of the k-th
instance. Source node 1 also uses an existing Byzantine broadcast algorithm [6] to broadcast
its L-bit input for the k-th instance to all the other nodes. Thus, at the end of this step, all the
fault-free nodes will reach correct agreement for the output for the k-th instance.

• (DC2) If for some node pair $a, b \in \mathcal{V}_k$, a message that node a claims above to have sent to
node b mismatches with the claim of received messages made by node b, then node pair a, b is
found in dispute. In step DC1, since a Byzantine broadcast algorithm is used to disseminate
the claims, all the fault-free nodes will identify identical node pairs in dispute.

It should be clear that a pair of fault-free nodes will never be found in dispute with each 
other in this step. 

• (DC3) The NAB algorithm is deterministic in nature. Therefore, the messages that should be
sent by each node in Phases 1 and 2 can be completely determined by the messages that the
node receives, and, in case of node 1, its initial input. Thus, if the claims of the messages sent
by some node i are inconsistent with the messages it claims to have received, and its initial
input (in case of node 1), then that node i must be faulty. Again, all fault-free nodes identify
these faulty nodes identically. Any nodes thus identified as faulty until now (including all
previous instances of NAB) are deemed to be "in dispute" with all their neighbors (to whom
the faulty nodes have incoming or outgoing links).

It should be clear that a fault-free node will never be found to be faulty in this step. 

• (DC4) Consider the node pairs that have been identified as being in dispute in DC2 and DC3
of at least one instance of NAB so far.

We will say that a set of nodes $F_i$, where $|F_i| \le f$, "explains" all the disputes so far, if for each
pair a, b found in dispute so far, at least one of a and b is in $F_i$. It should be easy to see that
for any set of disputes that may be observed, there must be at least one such set that explains
the disputes. Let $F_1, \dots, F_\Delta$ be all the sets that explain the disputes. It is easy to argue that
the nodes in the set below must be necessarily faulty (in fact, the nodes in the set intersection
below are also guaranteed to include nodes identified as faulty in step DC3).

\[ \bigcap_{\delta=1}^{\Delta} F_\delta \]

Then, $\mathcal{V}_{k+1}$ is obtained as $\mathcal{V}_k - \bigcap_{\delta=1}^{\Delta} F_\delta$. $\mathcal{E}_{k+1}$ is obtained by removing from $\mathcal{E}_k$ edges incident
on nodes in $\bigcap_{\delta=1}^{\Delta} F_\delta$, and also excluding edges between node pairs that have been found in
dispute so far.

As noted earlier, the above dispute control phase may be executed in at most f(f + 1) instances
of NAB.
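
Step DC4 can be illustrated with a brute-force sketch that enumerates every node set of size at most f that explains the observed disputes and intersects them. The function names and the example disputes are hypothetical, and the exhaustive enumeration is only meant for small n; the paper does not prescribe how the intersection is computed.

```python
from itertools import combinations

def explaining_sets(nodes, disputes, f):
    """All node sets of size at most f that contain a node of every disputed pair."""
    result = []
    for size in range(f + 1):
        for candidate in combinations(sorted(nodes), size):
            chosen = set(candidate)
            if all(a in chosen or b in chosen for a, b in disputes):
                result.append(chosen)
    return result

def necessarily_faulty(nodes, disputes, f):
    """Intersection of all explaining sets: nodes that must be faulty."""
    sets = explaining_sets(nodes, disputes, f)
    if not sets:
        raise ValueError("no set of at most f nodes explains the disputes")
    return set.intersection(*sets)

nodes = range(1, 8)                     # n = 7, so f = 2 satisfies f < n/3
star = [(3, 1), (3, 2), (3, 4)]         # node 3 is in dispute with three nodes
single = [(1, 2)]                       # one disputed pair only
```

With `star`, any explaining set that avoids node 3 would need three nodes, which exceeds f = 2, so node 3 is identified as faulty. With `single`, either node of the pair could be the faulty one, so the intersection is empty and only the edge between them is removed.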

C Proof of Theorem 1

To prove Theorem 1, we first prove that when the coding matrices are generated at random
as described, for a particular subgraph $H \in \Omega_k$, with non-zero probability, the coding matrices
$\{C_e \mid e \in \mathcal{E}_k\}$ define a matrix $C_H$ (as defined later) such that $D_H C_H = 0$ if and only if $D_H = 0$. Then
we prove that this is also simultaneously true for all subgraphs $H \in \Omega_k$.

C.1 For a given subgraph $H \in \Omega_k$

Consider any subgraph $H \in \Omega_k$. For each edge e = (i, j) in H, we "expand" the corresponding
coding matrix $C_e$ (of size $\rho_k \times z_e$) to a $(n-f-1)\rho_k \times z_e$ matrix $B_e$ as follows: $B_e$ consists of n - f - 1
blocks, and each block is a $\rho_k \times z_e$ matrix:

• If $i \ne n-f$ and $j \ne n-f$, then the i-th and j-th blocks equal $C_e$ and $-C_e$, respectively. The
other blocks are all set to 0.

\[ B_e^T = \big( 0 \cdots 0 \;\; \underbrace{C_e^T}_{i\text{-th block}} \;\; 0 \cdots 0 \;\; \underbrace{-C_e^T}_{j\text{-th block}} \;\; 0 \cdots 0 \big) \]

Here $(\cdot)^T$ denotes the transpose of a matrix or vector.

• If $i = n-f$, then the j-th block equals $-C_e$, and the other blocks are all set to the 0 matrix.

\[ B_e^T = \big( 0 \cdots 0 \;\; \underbrace{-C_e^T}_{j\text{-th block}} \;\; 0 \cdots 0 \big) \]

• If $j = n-f$, then the i-th block equals $C_e$, and the other blocks are all set to the 0 matrix.

\[ B_e^T = \big( 0 \cdots 0 \;\; \underbrace{C_e^T}_{i\text{-th block}} \;\; 0 \cdots 0 \big) \]

Let $D_{i,\beta} = X_i(\beta) - X_{n-f}(\beta)$ for $i < n-f$ be the difference between $X_i$ and $X_{n-f}$ in the $\beta$-th element.
Recall that $D_i = X_i - X_{n-f} = (D_{i,1} \cdots D_{i,\rho_k})$ and $D_H = (D_1 \cdots D_{n-f-1})$. So $D_H$ is a row vector
of $(n-f-1)\rho_k$ elements from $GF(2^{L/\rho_k})$ that captures the differences between $X_i$ and $X_{n-f}$ for all
$i < n-f$. It should be easy to see that

\[ (X_i - X_j) C_e = 0 \iff D_H B_e = 0. \]

So for edge e, steps 1-2 of Algorithm 1 have the effect of checking whether or not $D_H B_e = 0$.
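
The equivalence above rests on the identity $D_H B_e = (X_i - X_j) C_e$, which follows from the block structure of $B_e$ and holds over any ring. The sketch below checks it with plain integers standing in for the field $GF(2^{L/\rho_k})$ (an assumption made to keep the example short); the helper names and the random test data are hypothetical.

```python
import random

def expand(C_e, i, j, n_minus_f):
    """Build B_e: n-f-1 stacked blocks with C_e in block i and -C_e in block j.

    Node n-f has no block, so an endpoint equal to n-f contributes nothing.
    """
    rho, z = len(C_e), len(C_e[0])
    zero = [[0] * z for _ in range(rho)]
    negated = [[-c for c in row] for row in C_e]
    blocks = []
    for node in range(1, n_minus_f):
        blocks.append(C_e if node == i else negated if node == j else zero)
    return [row for block in blocks for row in block]

def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[r] * M[r][c] for r in range(len(v))) for c in range(len(M[0]))]

def identity_holds(X, C_e, i, j, n_minus_f):
    """Check D_H B_e == (X_i - X_j) C_e for one edge e = (i, j)."""
    last = X[n_minus_f]
    D_H = [x - y for node in range(1, n_minus_f) for x, y in zip(X[node], last)]
    diff = [x - y for x, y in zip(X[i], X[j])]
    return vec_mat(D_H, expand(C_e, i, j, n_minus_f)) == vec_mat(diff, C_e)

rng = random.Random(0)
n_minus_f, rho, z = 4, 3, 2
X = {node: [rng.randrange(100) for _ in range(rho)] for node in range(1, n_minus_f + 1)}
C_e = [[rng.randrange(100) for _ in range(z)] for _ in range(rho)]
checks = [identity_holds(X, C_e, i, j, n_minus_f) for i, j in [(1, 2), (4, 2), (3, 4)]]
```

The three edges exercise the three cases of the construction: neither endpoint is node n - f, the tail is node n - f, and the head is node n - f.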

If we label the set of edges in H as $e_1, e_2, \cdots$, and let m be the sum of the capacities of all edges in
H, then we construct a $(n-f-1)\rho_k \times m$ matrix $C_H$ by concatenating all expanded coding matrices:

\[ C_H = (B_{e_1} \; B_{e_2} \; \cdots), \]

where each column of $C_H$ represents one coded symbol sent in H over the corresponding edge.
Then steps 1-2 of Algorithm 1 for all edges in H have the same effect as checking whether or not
$D_H C_H = 0$. So to prove Theorem 1, we need to show that there exists at least one $C_H$ such that

\[ D_H C_H = 0 \iff D_H = 0. \]

It is obvious that if $D_H = 0$, then $D_H C_H = 0$ for any $C_H$. So all that is left to show is that there exists
at least one $C_H$ such that $D_H C_H = 0 \Rightarrow D_H = 0$. It is then sufficient to show that $C_H$ contains
a $(n-f-1)\rho_k \times (n-f-1)\rho_k$ submatrix $M_H$ that is invertible, because when such an invertible
submatrix exists,

\[ D_H C_H = 0 \Rightarrow D_H M_H = 0 \Rightarrow D_H = 0. \]

Now we describe how one such submatrix $M_H$ can be obtained. Notice that each column of $C_H$
represents one coded symbol sent on the corresponding edge. A $(n-f-1)\rho_k \times (n-f-1)$ submatrix
S of $C_H$ is said to be a "spanning matrix" of H if the edges corresponding to the columns of S form
an undirected spanning tree of $\overline{H}$, the undirected representation of H. In Figure 2(d), an undirected
spanning tree of the undirected graph in Figure 2(b) is shown in dotted edges. It is worth pointing
out that an undirected spanning tree in an undirected graph $\overline{H}$ does not necessarily correspond to
a directed spanning tree in the corresponding directed graph H. For example, the directed edges in
Figure 2(a) corresponding to the dotted undirected edges in Figure 2(d) do not constitute a spanning
tree in the directed graph in Figure 2(a).

It is known that in an undirected graph whose MINCUT equals U, at least U/2 undirected
unit-capacity spanning trees can be embedded [16] (see the footnote on embedding below). This
implies that $C_H$ contains a set of $U_k/2$ spanning matrices such that no two spanning matrices in
the set cover the same column in $C_H$. Let $\{S_1, \cdots, S_{\rho_k}\}$ be one set of $\rho_k \le U_k/2$ such spanning
matrices of H. Then the union of these spanning matrices forms an $(n-f-1)\rho_k \times (n-f-1)\rho_k$
submatrix of $C_H$:

\[ M_H = (S_1 \; \cdots \; S_{\rho_k}). \]



Next, we will show that when the set of coding matrices is generated as described in Theorem
1, with non-zero probability we obtain an invertible square matrix $M_H$. When $M_H$ is invertible,

\[ D_H M_H = 0 \iff D_H = 0 \iff X_i = X_{n-f} \text{ for all } i < n-f. \]

For the following discussion, it is convenient to reorder the elements of $D_H$ into

\[ \tilde{D}_H = \big( D_{1,1} \cdots D_{n-f-1,1} \;\; D_{1,2} \cdots D_{n-f-1,2} \;\; \cdots \;\; D_{1,\rho_k} \cdots D_{n-f-1,\rho_k} \big), \]

so that the $((\beta-1)(n-f-1)+1)$-th through the $\beta(n-f-1)$-th elements of $\tilde{D}_H$ represent the difference
between $X_i$ ($i = 1, \cdots, n-f-1$) and $X_{n-f}$ in the $\beta$-th element.

We also reorder the rows of each spanning matrix $S_q$ ($q = 1, \cdots, \rho_k$) accordingly. It can be
shown that after reordering, $S_q$ becomes $\tilde{S}_q$ and has the following structure:

\[ \tilde{S}_q = \begin{pmatrix} A_q S_{q,1} \\ A_q S_{q,2} \\ \vdots \\ A_q S_{q,\rho_k} \end{pmatrix} \tag{7} \]

Here $A_q$ is a $(n-f-1) \times (n-f-1)$ square matrix, and it is called the adjacency matrix of the
spanning tree corresponding to $S_q$. $A_q$ is formed as follows. Suppose that the r-th column of $S_q$
corresponds to a coded symbol sent over a directed edge (i, j) in H, then

1. If $i \ne n-f$ and $j \ne n-f$, then the r-th column of $A_q$ has the i-th element as 1 and the j-th
element as -1; the remaining entries in that column are all 0;

2. If $i = n-f$, then the j-th element of the r-th column of $A_q$ is set to -1; the remaining elements
of that column are all 0;

3. If $j = n-f$, then the i-th element of the r-th column of $A_q$ is set to 1; the remaining elements
of that column are all 0.



Footnote: The definition of embedding undirected unit-capacity spanning trees in undirected graphs is
similar to embedding directed unit-capacity spanning trees in directed graphs (by dropping the
direction of edges).

For example, suppose H is the graph shown in Figure 2(b) and $S_q$ corresponds to a spanning
tree of $\overline{H}$ consisting of the dotted edges in Figure 2(d). Suppose that we index the corresponding
directed edges in the graph shown in Figure 2(a) in the following order: (2,3), (1,4), (4,3). The
resulting adjacency matrix is

\[ A_q = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ -1 & 0 & -1 \end{pmatrix}. \]
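
The three construction rules for $A_q$ can be sketched in a few lines and applied to the edge order (2,3), (1,4), (4,3) from the example, with n - f = 4. The function names are hypothetical, and the cofactor-expansion determinant is only suitable for the small matrices used here.

```python
def adjacency_matrix(edges, n_minus_f):
    """Column r gets +1 in row i and -1 in row j for edge (i, j); node n-f has no row."""
    size = n_minus_f - 1
    if len(edges) != size:
        raise ValueError("a spanning tree on n-f nodes has exactly n-f-1 edges")
    A = [[0] * size for _ in range(size)]
    for r, (i, j) in enumerate(edges):
        if i != n_minus_f:
            A[i - 1][r] = 1
        if j != n_minus_f:
            A[j - 1][r] = -1
    return A

def det(M):
    """Integer determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for c, entry in enumerate(M[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in M[1:]]
            total += (-1) ** c * entry * det(minor)
    return total

A_q = adjacency_matrix([(2, 3), (1, 4), (4, 3)], n_minus_f=4)
d = det(A_q)
```

The determinant of this particular $A_q$ has absolute value 1, in line with the claim proved in Appendix C.3.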

On the other hand, each $(n-f-1) \times (n-f-1)$ square matrix $S_{q,p}$ is a diagonal matrix. The
r-th diagonal element of $S_{q,p}$ equals the p-th coefficient used to compute the coded symbol
corresponding to the r-th column of $S_q$. For example, suppose the first column of $S_q$ corresponds
to a coded packet $X_1(1) + 2X_1(2)$ being sent on link (1,2). Then the first diagonal elements of $S_{q,1}$
and $S_{q,2}$ are 1 and 2, respectively.

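So after reordering, $M_H$ can be written as $\tilde{M}_H$, which has the following structure:

\[ \tilde{M}_H = \begin{pmatrix}
A_1 S_{1,1} & A_2 S_{2,1} & \cdots & A_{\rho_k} S_{\rho_k,1} \\
A_1 S_{1,2} & A_2 S_{2,2} & \cdots & A_{\rho_k} S_{\rho_k,2} \\
\vdots & \vdots & \ddots & \vdots \\
A_1 S_{1,\rho_k} & A_2 S_{2,\rho_k} & \cdots & A_{\rho_k} S_{\rho_k,\rho_k}
\end{pmatrix} \tag{8} \]

Notice that $\tilde{M}_H$ is obtained by permuting the rows of $M_H$. So showing that $M_H$ is invertible is
equivalent to showing that $\tilde{M}_H$ is invertible.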



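Define

\[ M_q = \begin{pmatrix}
A_1 S_{1,1} & \cdots & A_q S_{q,1} \\
\vdots & \ddots & \vdots \\
A_1 S_{1,q} & \cdots & A_q S_{q,q}
\end{pmatrix} \]

for $1 \le q \le \rho_k$. Note that $M_{q_1}$ is a sub-matrix of $M_{q_2}$ when $q_1 < q_2$, and $M_{\rho_k} = \tilde{M}_H$. We prove the
following lemma:

Lemma 1 For any $\rho_k \le U_k/2$, with probability at least $\left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right)^{\rho_k}$, matrix $\tilde{M}_H$ is invertible. Hence $M_H$
is also invertible.

Proof: We show that $M_q$ is invertible with probability at least $\left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right)^{q}$ for every $q \le \rho_k$.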

The proof is done by induction, with q = 1 being the base case.

Base Case - q = 1:

\[ M_1 = A_1 S_{1,1}. \tag{9} \]

As shown later in Appendix C.3, $A_q$ is always invertible and $\det(A_q) = \pm 1$. Since $S_{1,1}$ is a
$(n-f-1)$-by-$(n-f-1)$ diagonal matrix, it is invertible provided that all its $(n-f-1)$ diagonal
elements are non-zero. Remember that the diagonal elements of $S_{1,1}$ are chosen uniformly and
independently from $GF(2^{L/\rho_k})$. The probability that they are all non-zero is

\[ \left(1 - \frac{1}{2^{L/\rho_k}}\right)^{n-f-1} \ge 1 - \frac{n-f-1}{2^{L/\rho_k}}. \]
Induction Step - q to $q + 1 \le \rho_k$: The square matrix $M_{q+1}$ can be written as

\[ M_{q+1} = \begin{pmatrix} M_q & P_q \\ F_q & A_{q+1} S_{q+1,q+1} \end{pmatrix} \tag{10} \]

where

\[ P_q = \begin{pmatrix} A_{q+1} S_{q+1,1} \\ A_{q+1} S_{q+1,2} \\ \vdots \\ A_{q+1} S_{q+1,q} \end{pmatrix} \tag{11} \]

is an $(n-f-1)q$-by-$(n-f-1)$ matrix, and

\[ F_q = \big( A_1 S_{1,q+1} \; \cdots \; A_q S_{q,q+1} \big) \tag{12} \]

is an $(n-f-1)$-by-$(n-f-1)q$ matrix.

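Assuming that $M_q$ is invertible, we transform $M_{q+1}$ as follows:

\[ M'_{q+1} = \begin{pmatrix} I_{(n-f-1)q} & 0 \\ 0 & A_{q+1}^{-1} \end{pmatrix} M_{q+1} \begin{pmatrix} I_{(n-f-1)q} & -M_q^{-1} P_q \\ 0 & I_{(n-f-1)} \end{pmatrix} \tag{13} \]

\[ = \begin{pmatrix} I_{(n-f-1)q} & 0 \\ 0 & A_{q+1}^{-1} \end{pmatrix} \begin{pmatrix} M_q & P_q \\ F_q & A_{q+1} S_{q+1,q+1} \end{pmatrix} \begin{pmatrix} I_{(n-f-1)q} & -M_q^{-1} P_q \\ 0 & I_{(n-f-1)} \end{pmatrix} \tag{14} \]

\[ = \begin{pmatrix} M_q & 0 \\ A_{q+1}^{-1} F_q & S_{q+1,q+1} - A_{q+1}^{-1} F_q M_q^{-1} P_q \end{pmatrix} \tag{15} \]

Here $I_{(n-f-1)q}$ and $I_{(n-f-1)}$ denote the $(n-f-1)q \times (n-f-1)q$ and the $(n-f-1) \times (n-f-1)$
identity matrices, respectively. Note that $|\det(M'_{q+1})| = |\det(M_{q+1})|$, since the matrix multiplied at the left has
determinant $\pm 1$, and the matrix multiplied at the right has determinant 1.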

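Observe that the diagonal elements of the $(n-f-1) \times (n-f-1)$ diagonal matrix $S_{q+1,q+1}$ are
chosen independently from $A_{q+1}^{-1} F_q M_q^{-1} P_q$. Then it can be proved that $S_{q+1,q+1} - A_{q+1}^{-1} F_q M_q^{-1} P_q$ is
invertible with probability at least $1 - \frac{n-f-1}{2^{L/\rho_k}}$ (see Appendix C.4), given that $M_q$ is invertible, which
happens with probability at least $\left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right)^{q}$ according to the induction assumption. So we have

\[ \Pr\{M_{q+1} \text{ is invertible}\} \ge \left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right)^{q} \left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right) = \left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right)^{q+1}. \tag{16} \]

This completes the induction. Now we can see that $M_{\rho_k} = \tilde{M}_H$ is invertible with probability at least

\[ \left(1 - \frac{n-f-1}{2^{L/\rho_k}}\right)^{\rho_k} \ge 1 - \frac{(n-f-1)\rho_k}{2^{L/\rho_k}} \to 1, \text{ as } L \to \infty. \tag{17} \]

□

Now we have proved that there exists a set of coding matrices $\{C_e \mid e \in \mathcal{E}_k\}$ such that the resulting
$C_H$ satisfies the condition that $D_H C_H = 0$ if and only if $D_H = 0$.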

C.2 For all subgraphs in $\Omega_k$

In this section, we are going to show that, for $\Omega_k$, if the coding matrices $\{C_e \mid e \in \mathcal{E}_k\}$ are generated
as described in Theorem 1, then with non-zero probability the square matrices $\{M_H \mid H \in \Omega_k\}$
are all invertible simultaneously. When this is true, there exists a set of coding matrices that is
correct.

To show that the $M_H$'s for all $H \in \Omega_k$ are simultaneously invertible with non-zero probability, we
consider the product of all these square matrices:

\[ \prod_{H \in \Omega_k} M_H. \]

According to Lemma 1, each $M_H$ ($H \in \Omega_k$) is invertible with non-zero probability. It implies that
$\det(M_H)$ is a non-identically-zero polynomial of the random coding coefficients of degree at most
$(n-f-1)\rho_k$ (recall that $M_H$ is a square matrix of size $(n-f-1)\rho_k$). So

\[ \det\Big( \prod_{H \in \Omega_k} M_H \Big) = \prod_{H \in \Omega_k} \det(M_H) \]

is a non-identically-zero polynomial of the random coefficients of degree at most $|\Omega_k|(n-f-1)\rho_k$.
Notice that each coded symbol is used once in each subgraph H. So each random coefficient
appears in at most one column in each $M_H$. It follows that the largest exponent of any random
coefficient in $\det\big(\prod_{H \in \Omega_k} M_H\big)$ is at most $|\Omega_k|$.

According to Lemma 1 of [?], the probability that $\det\big(\prod_{H \in \Omega_k} M_H\big)$ is non-zero is at least

\[ \left(1 - 2^{-L/\rho_k} |\Omega_k|\right)^{(n-f-1)\rho_k} \ge 1 - 2^{-L/\rho_k} \big[ |\Omega_k| (n-f-1) \rho_k \big]. \]

According to the way $\mathcal{G}_k$ is constructed and the definition of $\Omega_k$, it should not be hard to see that
$\mathcal{G}_k$ is a subgraph of $\mathcal{G}_1 = \mathcal{G}$, and $\Omega_k \subseteq \Omega_1$. Notice that $|\Omega_1| = \binom{n}{n-f}$. So $|\Omega_k| \le \binom{n}{n-f}$ and Theorem 1
follows.

C.3 Proof of $A_q$ being Invertible

Given an adjacency matrix $A_q$, let us call the corresponding spanning tree of $\overline{H}$ $T_q$. For edges
in $T_q$ incident on node n - f, the corresponding columns in $A_q$ have exactly one non-zero entry.
Also, the column corresponding to an edge that is incident on node i has a non-zero entry in row
i. Since there must be at least one edge in $T_q$ that is incident on node n - f, there must be at least
one column of $A_q$ that has only one non-zero element. Also, since every node is incident on at
least one edge in $T_q$, every row of $A_q$ has at least one non-zero element. Since there is at most
one edge between every pair of nodes in $T_q$, no two columns in $A_q$ are non-zero in identical rows.
Therefore, by column manipulation, we can transform matrix $A_q$ into another matrix in which
every row and every column has exactly one non-zero element. Hence $\det(A_q)$ equals either 1
or -1, and $A_q$ is invertible.

C.4 Proof of $S_{q+1,q+1} - A_{q+1}^{-1} F_q M_q^{-1} P_q$ being Invertible

Let W be an arbitrary fixed $w \times w$ matrix. Consider a random $w \times w$ diagonal matrix S
with w diagonal elements $s_1, \cdots, s_w$:

\[ S = \begin{pmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s_w \end{pmatrix} \tag{18} \]

The diagonal elements of S are selected independently and uniformly at random from $GF(2^{L/\rho_k})$.
Then we have:

Lemma 2 The probability that the $w \times w$ matrix S - W is invertible is lower bounded by:

\[ \Pr\{(S - W) \text{ is invertible}\} \ge 1 - \frac{w}{2^{L/\rho_k}}. \tag{19} \]

Proof: Consider the determinant of matrix S - W.

\[ \det(S - W) = \det \begin{pmatrix} (s_1 - W_{1,1}) & -W_{1,2} & \cdots & -W_{1,w} \\ -W_{2,1} & (s_2 - W_{2,2}) & \cdots & -W_{2,w} \\ \vdots & \vdots & \ddots & \vdots \\ -W_{w,1} & -W_{w,2} & \cdots & (s_w - W_{w,w}) \end{pmatrix} \tag{20} \]

\[ = (s_1 - W_{1,1})(s_2 - W_{2,2}) \cdots (s_w - W_{w,w}) + \text{other terms} \tag{21} \]

\[ = \prod_{i=1}^{w} s_i + \mathcal{W}_w \tag{22} \]

The first term above, $\prod_{i=1}^{w} s_i$, is a degree-w polynomial of $s_1, \cdots, s_w$. $\mathcal{W}_w$ is a polynomial of degree at
most w - 1 of $s_1, \cdots, s_w$, and it represents the remaining terms in $\det(S - W)$. Notice that $\det(S - W)$
cannot be identically zero since it contains only one degree-w term. Then by the Schwartz-Zippel
Theorem, the probability that $\det(S - W) = 0$ is at most $w/2^{L/\rho_k}$. Since S - W is invertible if and only if
$\det(S - W) \ne 0$, we conclude that

\[ \Pr\{(S - W) \text{ is invertible}\} \ge 1 - \frac{w}{2^{L/\rho_k}}. \tag{23} \]
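
The bound of Lemma 2 can be checked exhaustively on a toy instance. The sketch below deliberately swaps in a small prime field GF(p) for $GF(2^{L/\rho_k})$, since the Schwartz-Zippel bound only depends on the field size; this substitution, the function names, and the matrix W are assumptions for illustration and are not part of the paper. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
from itertools import product

def det_mod_p(M, p):
    """Determinant over GF(p), p prime, by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    n, d = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            d = -d                                  # a row swap flips the sign
        d = d * M[col][col] % p
        inv = pow(M[col][col], -1, p)               # modular inverse of the pivot
        for r in range(col + 1, n):
            factor = M[r][col] * inv % p
            for c in range(col, n):
                M[r][c] = (M[r][c] - factor * M[col][c]) % p
    return d % p

def singular_fraction(W, p):
    """Fraction of diagonal matrices S over GF(p) for which S - W is singular."""
    w = len(W)
    singular = 0
    for diag in product(range(p), repeat=w):
        S_minus_W = [[(diag[r] if r == c else 0) - W[r][c] for c in range(w)]
                     for r in range(w)]
        if det_mod_p(S_minus_W, p) == 0:
            singular += 1
    return singular / p ** w

p = 7
W = [[1, 2, 3], [4, 5, 6], [0, 1, 2]]               # arbitrary fixed matrix, w = 3
fraction = singular_fraction(W, p)
bound = len(W) / p                                  # Schwartz-Zippel bound w / |field|
```

For any fixed W the measured singular fraction stays at or below w/p, which mirrors the statement that S - W is invertible with probability at least 1 - w/p in this toy field.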



By setting $S = S_{q+1,q+1}$, $W = A_{q+1}^{-1} F_q M_q^{-1} P_q$, and $w = n-f-1$, we prove that
$S_{q+1,q+1} - A_{q+1}^{-1} F_q M_q^{-1} P_q$ is invertible with probability at least $1 - \frac{n-f-1}{2^{L/\rho_k}}$. □

D Throughput of NAB

First consider the time cost of each operation in instance k of NAB:



• Phase 1: It takes $L/\gamma_k \le L/\gamma^*$ time units, since unreliable broadcast from the source node 1 at
rate $\gamma_k$ is achievable and $\gamma_k \ge \gamma^*$, as discussed in Appendix A.

• Phase 2 - Equality check: As discussed previously, it takes $L/\rho_k \le L/\rho^*$ time units.

• Phase 2 - Broadcasting outcomes of equality check: To reliably broadcast the 1-bit flags
from the equality check algorithm, a previously proposed Byzantine broadcast algorithm,
such as [6], is used. The algorithm from [6], denoted as Broadcast_Default hereafter, reliably
broadcasts 1 bit by communicating no more than P(n) bits in a complete graph, where P(n) is a
polynomial of n. In our setting, $\mathcal{G}$ might not be complete. However, the connectivity of $\mathcal{G}$ is
at least 2f + 1. It is well-known that, in a graph with connectivity at least 2f + 1 and at most
f faulty nodes, reliable end-to-end communication from any node i to any other node j can be
achieved by sending the same copy of data along a set of 2f + 1 node-disjoint paths from node
i to node j and taking the majority at node j. By doing this, we can emulate a complete graph
in an incomplete graph $\mathcal{G}$. Then it can be shown that, by running Broadcast_Default on
top of the emulated complete graph, reliably broadcasting the 1-bit flags can be completed
in $O(n^\alpha)$ time units, for some constant $\alpha > 0$.

• Phase 3: If Phase 3 is performed in instance k, every node i in $\mathcal{V}_k$ uses Broadcast_Default
to reliably broadcast all the messages that it claims to have received from other nodes, and
sent to the other nodes, during Phases 1 and 2 of the k-th instance. Similar to the discussion
above about broadcasting the outcomes of equality check, it can be shown that the time it
takes to complete Phase 3 is $O(L n^\beta)$ for some constant $\beta > 0$.
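
The majority rule used above to emulate a complete graph (2f + 1 node-disjoint paths, at most f of which pass through faulty nodes) can be sketched at the receiving node as follows. Only the receiver-side decision is modelled; the function name and the sample values are hypothetical, and finding the node-disjoint paths is outside this sketch.

```python
from collections import Counter

def deliver(copies, f):
    """Return the value carried by at least f+1 of the 2f+1 received copies."""
    if len(copies) != 2 * f + 1:
        raise ValueError("expected one copy per node-disjoint path (2f + 1 copies)")
    value, count = Counter(copies).most_common(1)[0]
    if count < f + 1:
        raise ValueError("no value is supported by f + 1 paths")
    return value

# f = 1: three paths, one of which is corrupted by a faulty relay.
flag_f1 = deliver([1, 0, 1], f=1)
# f = 2: five paths, two of which are corrupted.
flag_f2 = deliver([0, 0, 1, 1, 0], f=2)
```

Because at most f paths contain a faulty node, the value sent by a fault-free sender always arrives on at least f + 1 paths, so it is the unique value that can reach the f + 1 threshold.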
[Figure 3: Example of pipelining. Each round consists of $L/\gamma^*$ time units for Phase 1 and
$L/\rho^* + O(n^\alpha)$ time units for Phase 2.]



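Now consider a sequence of Q > 0 instances of NAB. As discussed previously, Phase 3 will
be performed at most f(f + 1) times throughout the execution of the algorithm. So we have the
following upper bound of the execution time of Q instances of NAB:

\[ t(\mathcal{G}, L, Q, NAB) \le Q \left( \frac{L}{\gamma^*} + \frac{L}{\rho^*} + O(n^\alpha) \right) + f(f+1)\, O(L n^\beta). \]

Then the throughput of NAB can be lower bounded by

\[ \begin{aligned}
T(\mathcal{G}, L, NAB) &= \lim_{Q \to \infty} \frac{LQ}{t(\mathcal{G}, L, Q, NAB)} \\
&\ge \lim_{Q \to \infty} \frac{LQ}{Q \left( \frac{L}{\gamma^*} + \frac{L}{\rho^*} + O(n^\alpha) \right) + f(f+1)\, O(L n^\beta)} \\
&\ge \lim_{Q \to \infty} \left( \frac{\gamma^* + \rho^*}{\gamma^* \rho^*} + \frac{O(n^\alpha)}{L} + \frac{O(n^{\beta+2})}{Q} \right)^{-1} \quad (\text{since } f < n/3)
\end{aligned} \tag{24-27} \]

Notice that for a given graph $\mathcal{G}$, $\{n, \gamma^*, \rho^*, \alpha, \beta\}$ are all constants independent of L and Q. So for
sufficiently large values of L and Q, the last two terms in the last inequality become negligible
compared to the first term, and the throughput of NAB approaches a value that is at least as
large as $T_{NAB}$, which is defined as

\[ T_{NAB}(\mathcal{G}) = \frac{\gamma^* \rho^*}{\gamma^* + \rho^*}. \tag{28} \]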
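
The way the overhead terms vanish can be illustrated numerically. In the sketch below the hidden constants of the $O(\cdot)$ terms are replaced by explicit parameters `c1` and `c2`, and the values of `alpha` and `beta` are picked arbitrarily; all of these, along with the function names, are assumptions made for illustration, since the paper only states that such constants exist.

```python
def t_nab(gamma_star, rho_star):
    """Limiting throughput lower bound gamma* rho* / (gamma* + rho*)."""
    return gamma_star * rho_star / (gamma_star + rho_star)

def finite_bound(L, Q, gamma_star, rho_star, n, f, alpha, beta, c1=1.0, c2=1.0):
    """L*Q divided by the execution-time upper bound for Q instances of L bits.

    c1 * n**alpha stands in for the O(n^alpha) flag-broadcast cost per instance,
    c2 * L * n**beta for the O(L n^beta) cost of one dispute-control phase.
    """
    total_time = (Q * (L / gamma_star + L / rho_star + c1 * n ** alpha)
                  + f * (f + 1) * c2 * L * n ** beta)
    return L * Q / total_time

params = dict(gamma_star=2, rho_star=1, n=4, f=1, alpha=2, beta=2)
small = finite_bound(L=10, Q=10, **params)
large = finite_bound(L=10**9, Q=10**9, **params)
limit = t_nab(2, 1)
```

With small L and Q the overhead dominates and the bound is far below the limit; as both grow, the bound approaches $T_{NAB}$ from below.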



In the above discussion, we implicitly assumed that transmissions during the unreliable broadcast
in Phase 1 complete all at the same time, by assuming no propagation delay. However, when
propagation delay is considered, a node cannot forward a message/symbol until it finishes receiving
it. So for the k-th instance of NAB, the information broadcast by the source propagates only one
hop every $L/\gamma_k$ time units. So for a large network, the "time span" of Phase 1 can be much larger
than $L/\gamma_k$. This problem can be solved by pipelining: We divide the time horizon into rounds
of $\left( \frac{L}{\gamma^*} + \frac{L}{\rho^*} + O(n^\alpha) \right)$ time units. For each instance of NAB, the L-bit input from the source node 1
propagates one hop per round, using the first $L/\gamma^*$ time units, until Phase 1 completes. Then the
remaining $\left( \frac{L}{\rho^*} + O(n^\alpha) \right)$ time units of the last round are used to perform Phase 2. An example in
which the broadcast in Phase 1 takes 3 hops is shown in Figure 3. By pipelining, we achieve the
throughput lower bound $T_{NAB}(\mathcal{G})$ derived above.
E Construction of $\Gamma$

A subgraph of $\mathcal{G}$ belonging to $\Gamma$ is obtained as follows: We will say that edges in $W \subset \mathcal{E}$ are
"explainable" if there exists a set $F \subset \mathcal{V}$ such that (i) F contains at most f nodes, and (ii) each edge
in W is incident on at least one node in F. Set F is then said to "explain set W".

Consider each explainable set of edges $W \subset \mathcal{E}$. Suppose that $F_1, \cdots, F_\Delta$ are all the subsets of
$\mathcal{V}$ that explain edge set W. A subgraph $\Psi_W$ of $\mathcal{G}$ is obtained by removing edges in W from $\mathcal{E}$, and
nodes in $\bigcap_{\delta=1}^{\Delta} F_\delta$ from $\mathcal{V}$. In general, $\Psi_W$ above may or may not contain the source node 1. Only
those $\Psi_W$'s that do contain node 1 belong to $\Gamma$.

F Proof of Theorem 2

In an arbitrary point-to-point network $\mathcal{G}(\mathcal{V}, \mathcal{E})$, the capacity of the BB problem with node 1 being
the source and up to f < n/3 faults satisfies the following upper bounds.

F.1 $C_{BB}(\mathcal{G}) \le \gamma^*$

Proof: Consider any $\Psi_W \in \Gamma$ and let W be the set of edges in $\mathcal{G}$ but not in $\Psi_W$. By the construction
of $\Gamma$, there must be at least one set $F \subset \mathcal{V}$ that explains W and does not contain the source node 1.
We are going to show that $C_{BB}(\mathcal{G}) \le MINCUT(\Psi_W, 1, i)$ for every node $i \ne 1$ that is in $\Psi_W$.

Notice that there must exist a set of nodes that explains W and does not contain node 1;
otherwise node 1 is not in $\Psi_W$. Without loss of generality, assume that $F_1$ is one such set of nodes.

First consider any node $i \ne 1$ in $\Psi_W$ such that $i \notin F_1$. Let all the nodes in $F_1$ be faulty such that they
refuse to communicate over edges in W, but otherwise behave correctly. In this case, since the
source is fault-free, node i must be able to receive the L-bit input that node 1 is trying to broadcast.
So $C_{BB}(\mathcal{G}) \le MINCUT(\Psi_W, 1, i)$.

Next we consider a node $i \ne 1$ in $\Psi_W$ with $i \in F_1$. Notice that node i cannot be contained in all
sets of nodes that explain W, otherwise node i is not in $\Psi_W$. Then there are only two possibilities:

1. There exists a set F that explains W and contains neither node 1 nor node i. In this case,
$C_{BB}(\mathcal{G}) \le MINCUT(\Psi_W, 1, i)$ according to the above argument by replacing $F_1$ with F.

2. Otherwise, any set F that explains W and does not contain node i must contain node 1. Let
$F_2$ be one such set of nodes.

Define $\mathcal{V}^- = \mathcal{V} - F_1 - F_2$. $\mathcal{V}^-$ is not empty since $F_1$ and $F_2$ both contain at most f nodes and
there are $n \ge 3f + 1$ nodes in $\mathcal{V}$. Consider two scenarios with the same input value x: (1)
Nodes in $F_1$ (which does not contain node 1) are faulty and refuse to communicate over edges in W,
but otherwise behave correctly; and (2) Nodes in $F_2$ (which contains node 1) are faulty and refuse
to communicate over edges in W, but otherwise behave correctly. In both cases, nodes in $\mathcal{V}^-$
are fault-free.

Observe that among edges between nodes in $\mathcal{V}^-$ and $F_1 \cup F_2$, only edges between $\mathcal{V}^-$ and
$F_1 \cap F_2$ could have been removed, because otherwise W cannot be explained by both $F_1$ and
$F_2$. So nodes in $\mathcal{V}^-$ cannot distinguish between the two scenarios above. In scenario (1), the
source node 1 is not faulty. Hence nodes in $\mathcal{V}^-$ must agree with the value x that node 1 is
trying to broadcast, according to the validity condition. Since nodes in $\mathcal{V}^-$ cannot distinguish
between the two scenarios, they must also set their outputs to x in scenario (2), even though
in this case the source node 1 is faulty. Then according to the agreement condition, node i
must agree with nodes in $\mathcal{V}^-$ in scenario (2), which means that node i also has to learn x.
So $C_{BB}(\mathcal{G}) \le MINCUT(\Psi_W, 1, i)$.

This completes the proof. □

Footnote (to Appendix E): It is possible that $\Psi_W$ for different W may be identical. This does not
affect the correctness of our algorithm.
F.2 $C_{BB}(\mathcal{G}) \le 2\rho^*$

Proof: For a subgraph $H \in \Omega_1$ (and accordingly $\overline{H} \in \overline{\Omega}_1$), denote

\[ U_H = \min_{\text{nodes } i,j \text{ in } \overline{H}} MINCUT(\overline{H}, i, j). \]

We will prove the upper bound by showing that $C_{BB}(\mathcal{G}) \le U_H$ for every $H \in \Omega_1$.

Suppose on the contrary that Byzantine broadcast can be done at a rate $R \ge U_H + \epsilon$ for some
constant $\epsilon > 0$. So there exists a BB algorithm, named $\mathcal{A}$, that can broadcast $t(U_H + \epsilon)$ bits using
t time units, for some t > 0.

Let E be a set of edges in $\overline{H}$ that corresponds to one of the minimum cuts in $\overline{H}$. In other words,
$\sum_{e \in E} z_e = U_H$, and the nodes in H can be partitioned into two non-empty sets $\mathcal{L}$ and $\mathcal{R}$ such that
$\mathcal{L}$ and $\mathcal{R}$ are disconnected from each other if edges in E are removed. Also denote F as the set of
nodes that are in $\mathcal{G}$ but not in H. Notice that since H contains (n - f) nodes, F contains f nodes.

Notice that in t time units, at most $t U_H < t(U_H + \epsilon)$ bits of information can be sent over edges in
E. According to the pigeonhole principle, there must exist two different input values of $t(U_H + \epsilon)$
bits, denoted as u and v, such that in the absence of misbehavior, broadcasting u and v with
algorithm $\mathcal{A}$ results in the same communication pattern over edges in E.
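
The counting behind this pigeonhole step can be written out explicitly. The display below is a sketch under the simplifying assumption, not spelled out in the text, that the communication pattern over the cut E during t time units is fully described by at most $t U_H$ bits:

```latex
\[
\underbrace{2^{\,t(U_H+\epsilon)}}_{\text{possible input values}}
\;>\;
\underbrace{2^{\,t U_H}}_{\text{possible patterns over } E}
\quad\Longrightarrow\quad
\exists\, u \neq v \ \text{that induce the same pattern over } E .
\]
```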

First consider the case when F contains the source node 1. Consider the following three scenarios using
algorithm $\mathcal{A}$:

1. Node 1 broadcasts u, and none of the nodes misbehaves. So all nodes should set their outputs
to u.

2. Node 1 broadcasts v, and none of the nodes misbehaves. So all nodes should set their outputs
to v.

3. Nodes in F are faulty (this includes the source node 1). The faulty nodes in F behave to nodes in
$\mathcal{L}$ as in scenario 1, and behave to nodes in $\mathcal{R}$ as in scenario 2.

It can be shown that nodes in $\mathcal{L}$ cannot distinguish scenario 1 from scenario 3, and nodes in $\mathcal{R}$
cannot distinguish scenario 2 from scenario 3. So in scenario 3, nodes in $\mathcal{L}$ set their outputs to u
and nodes in $\mathcal{R}$ set their outputs to v. This violates the agreement condition and contradicts
the assumption that $\mathcal{A}$ solves BB at rate $U_H + \epsilon$. Hence $C_{BB}(\mathcal{G}) \le U_H$.

Next consider the case when F does not contain the source node 1. Without loss of generality,
suppose that node 1 is in $\mathcal{L}$. Consider the following three scenarios:

1. Node 1 broadcasts u, and none of the nodes misbehaves. So all nodes should set their outputs
to u.

2. Node 1 broadcasts v, and none of the nodes misbehaves. So all nodes should set their outputs
to v.

3. Node 1 broadcasts u, and nodes in F are faulty. The faulty nodes in F behave to nodes in $\mathcal{L}$
as in scenario 1, and behave to nodes in $\mathcal{R}$ as in scenario 2.

In this case, we can also show that nodes in $\mathcal{L}$ cannot distinguish scenario 1 from scenario 3, and
nodes in $\mathcal{R}$ cannot distinguish scenario 2 from scenario 3. So in scenario 3, nodes in $\mathcal{L}$ set their
outputs to u and nodes in $\mathcal{R}$ set their outputs to v. This violates the agreement condition and
contradicts the assumption that $\mathcal{A}$ solves BB at rate $U_H + \epsilon$. Hence $C_{BB}(\mathcal{G}) \le U_H$, and this
completes the proof. □

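G Proof of Theorem 3

Now we compare $T_{NAB}(\mathcal{G})$ with the upper bound on $C_{BB}(\mathcal{G})$ from Theorem 2. Recall that

\[ T_{NAB}(\mathcal{G}) = \frac{\gamma^* \rho^*}{\gamma^* + \rho^*} \]

and

\[ C_{BB}(\mathcal{G}) \le \min(\gamma^*, 2\rho^*). \]

There are 3 cases:

1. $\gamma^* \le \rho^*$: Observe that $T_{NAB}(\mathcal{G})$ is an increasing function of both $\gamma^*$ and $\rho^*$. For a given $\gamma^*$, it is
minimized when $\rho^*$ is minimized. So

\[ T_{NAB}(\mathcal{G}) \ge \frac{\gamma^* \gamma^*}{\gamma^* + \gamma^*} = \frac{\gamma^*}{2} \ge \frac{C_{BB}(\mathcal{G})}{2}. \tag{29} \]

The last inequality is due to $\gamma^* \ge C_{BB}(\mathcal{G})$.

2. $\gamma^* \le 2\rho^*$:

\[ T_{NAB}(\mathcal{G}) \ge \frac{\gamma^* \rho^*}{2\rho^* + \rho^*} = \frac{\gamma^*}{3} \ge \frac{C_{BB}(\mathcal{G})}{3}. \tag{30} \]

The last inequality is due to $\gamma^* \ge C_{BB}(\mathcal{G})$.

3. $\gamma^* > 2\rho^*$: Since $T_{NAB}(\mathcal{G})$ is an increasing function of $\gamma^*$, for a given $\rho^*$, it is minimized
when $\gamma^*$ is minimized. So

\[ T_{NAB}(\mathcal{G}) \ge \frac{2\rho^* \rho^*}{2\rho^* + \rho^*} = \frac{2\rho^*}{3} \ge \frac{C_{BB}(\mathcal{G})}{3}. \tag{31} \]

The second inequality is due to $2\rho^* \ge C_{BB}(\mathcal{G})$.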
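
The three cases can be sanity-checked with exact rational arithmetic over a grid of values. This is a numerical check of the inequalities between $T_{NAB}$ and $\min(\gamma^*, 2\rho^*)$ only, not a substitute for the proof; the grid range and function names are arbitrary choices made for illustration.

```python
from fractions import Fraction

def t_nab(gamma_star, rho_star):
    """Throughput lower bound gamma* rho* / (gamma* + rho*)."""
    return gamma_star * rho_star / (gamma_star + rho_star)

def capacity_upper_bound(gamma_star, rho_star):
    """Upper bound min(gamma*, 2 rho*) on the BB capacity."""
    return min(gamma_star, 2 * rho_star)

def check_grid(max_value=30):
    """Check the 1/3 bound everywhere and the 1/2 bound whenever gamma* <= rho*."""
    for g in range(1, max_value + 1):
        for twice_rho in range(1, 2 * max_value + 1):
            gamma_star = Fraction(g)
            rho_star = Fraction(twice_rho, 2)     # rho* = U/2 may be a half-integer
            bound = capacity_upper_bound(gamma_star, rho_star)
            if t_nab(gamma_star, rho_star) < bound / 3:
                return False
            if gamma_star <= rho_star and t_nab(gamma_star, rho_star) < bound / 2:
                return False
    return True

all_cases_hold = check_grid()
```

The 1/3 bound is tight at $\gamma^* = 2\rho^*$, where $T_{NAB} = 2\rho^*/3$ and $\min(\gamma^*, 2\rho^*) = 2\rho^*$.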